Quantum Knowledge in Phase Space

Quantum physics through the lens of Bayesian statistics considers probability to be a degree of belief and subjective. A Bayesian derivation of the probability density function in phase space is presented. Then, a Kullback–Liebler divergence in phase space is introduced to define interference and entanglement. Comparisons between each of these two quantities and the entropy are made. A brief presentation of entanglement in phase space to the spin degree of freedom and an extension to mixed states completes the work.


Introduction
This journal issue celebrates Claude Shannon's 1948 formulation of "lost information" in phone-line signals [1].It is curious that when von Neumann asked Shannon how he was getting on with their information theory, Shannon replied (according to [2]) "The theory was in excellent shape, except that he needed a good name for "missing information"."Why don't you call it entropy", von Neumann suggested."In the first place, a mathematical development very much like yours already exists in Boltzmann's statistical mechanics, and in the second place, no one understands entropy very well, so in any discussion you will be in a position of advantage".
A quantification of entanglement and interference in quantum phase space for pure states is proposed through Shannon entropy and related concepts (see Appendix B for a brief review).In order to arrive there a phase space probability density must be derived and leads to the next topic that has captivated much of quantum physics discussions since its first days, namely the role of measurement in physics.Measurements are associated with events in statistics, since through measurements full knowledge of a physical variable is acquired.The role of knowledge in quantum physics is then visited.

Bayesian Knowledge in Quantum Physics
In the field of statistics two views offered by the Bayesian [3] and the frequentist [4,5] divide the experts.Bayesian thinking is based on the idea that probabilities are degrees of belief about the events while the frequentist approach is based on the idea that the probability of an event occurring is equal to the long-run frequency with which that event occurs.

Measurements and Knowledge
A measurement in a statistical theory is modeled as an event, since full knowledge of the variable being measured is acquired.
In quantum physics the outcome of a measurement is associated with eigenvalues of a chosen Hermitian operator.For example, if one measures the position or the spin along the z-direction of a state, such a position or spin value becomes an event and the state becomes the eigenstate associated with the measured eigenvalue.
In both classical probability and quantum physics, there are scenarios where it is possible to infer full knowledge of a variable without the direct measurement of such a variable.A simple example in classical probability starts with two balls in a bag, one is red and the other is white.By randomly drawing one of them and upon the knowledge of the outcome, one can infer the other ball color immediately without any further measurement.The prior knowledge that there were two different color balls in the bag is combined with the evidence from the observation to give certainty, immediately, on the other ball's color.This knowledge acquired about the other ball is not causal since if another player is only told the color of the ball that was left in the bag, the player would immediately infer the color of the ball that was drawn.Reasoning with knowledge in this case is invariant with respect to the time of occurrence.In quantum physics entangled states offer an analogous scenario.In teleportation experiments [6][7][8][9] with entangled fermions, the spin z-direction of one fermion can immediately be inferred by measuring the spin z-direction component of the other fermion.The prior knowledge that both fermions must have opposite spin combined with the observation allows for such inference.However, such knowledge is not causal, it is acquired by combining the prior knowledge with an observation, like in the classical probability example.
The Bayesian view of quantum physics also says that knowledge associated with a state is subjective to the observer.For example, two observers conducting an experiment of quantum teleportation, even if they have the same prior knowledge of the set up, will have different knowledge about the experiment according to when and where a measurement is obtained.According to the special theory of relativity, information or knowledge cannot be transferred instantaneously and so observers of an entangled pair will possess different knowledge of the variables at the time one of them is measured and so their predictions about the outcome of the other variable will differ, e.g., see [10,11].These experiments with the entangled pair traveling at long distances suggest that quantum physics is best described as a Bayesian theory.It is worth stressing that a theory of knowledge is not necessarily a theory of cause and consequence.
One may wonder, is there a causal explanation for the quantum phenomena?Revisiting the Bohr vs. Einstein debate, e.g., see [12,13] and references, the view put forward here follows the epistemological view of Bohr: quantum theory is today the best model for predictions.The view put forward here also resonates with the ontological concern of Einstein adding the question: "Is there a causal theory that accounts for the quanta phenomena?"The quest here is for a better understanding of quantum theory as a Bayesian theory and the use of Shannon entropy and related concepts to characterize interference and entanglement.The quest for a causal model is left open.
Here, knowledge and information are meant to be the same thing.There is much work in information in quantum physics, e.g., [14][15][16][17] and references, but they use von Neumann mixed states entropy as a starting point which attributes zero information content for all pure states.In contrast, the starting point and focus here is pure states, the core of quantum physics theory.Mixed states do follow from pure states.The emphasis in using the term knowledge or degree of belief for the probabilities instead of information is that it is the language used in Bayesian theory and it stresses that it is subjective.Yet, the Bayesian view provides all the predictions quantum theory can make today and perhaps one can expand it as discussed next.

Phase Space
From a statistical perspective, randomness associated with a quantum state cannot be fully captured by one operator, afterwards an eigenstate of any one operator will seem to contain no randomness, a measurement by this operator will result with certainty in the eigenvalue of this eigenstate.
A unique aspect of quantum physics, expressed by the uncertainty principle, is that eigenvalues of two non-commuting operators cannot be measured simultaneously.Events cannot occur for such pair of variables simultaneously.These are mutually exclusive quantum variables and so, randomness associated with a state cannot be reduced to zero.To characterize the randomness associated with a state, non-commuting operators are then needed.A special operator is x with eigenstates |x .Together with the unitary evolution operator, they define the space-time properties of states.In quantum mechanics, the uncertainty principle associated with the operator x is derived from the non-commuting property [x, p] = ih, where p is the momentum operator.The variables x and k = p h are Fourier of each other, where k is the spatial frequency variable.The randomness of a state |Ψ t is then captured in phase space, the space formed by the pair of Fourier variables x, k.
In quantum field theory, a relativistic theory, space becomes the domain variable, not an operator, and the spatial frequency k becomes the Fourier domain variable [18].The phase space is then the space of the two domains of the quantum fields, one being the Fourier of the other.In this case, quantum field operators are rooted on the creation and annihilation operators defined in these two domains.A creation operator can then create a particle at some position or with some spatial frequency.The coefficients in front of these operators replace the role of the wave functions establishing the distribution in phase space.Phase space becomes the space where all randomness of a state is captured in quantum field theory.
In probability theory, any statistics of interest is derived from the joint distribution of all variables, so the joint distribution in phase space is the quantity to be derived.A constraint to events in phase space occurs from the fact that ψ(x, t) = x|Ψ t and φ(k, t) = k|Ψ t are the Fourier transform of each other.If one acquires full knowledge of one of the variables at time t, expressed by a Dirac delta distribution, then the other phase space variable must be described by a uniform distribution, indicating maximum entropy in this other variable.It is this constraint of our knowledge about the pair of variables x and k at time t that yields the uncertainty principle as clearly formulated by Robertson [19].The uncertainty principle suggests that events do occur in a volume element in phase space of size ∆x∆k ≥ 1, forming a coarse representation of the phase space.Thus, with respect to the statistical mechanical view of Gibbs [20], quantum mechanics already have a coarse mechanism built in to describe events in phase space.
Early attempts to create a quantum probability distribution in phase space by Wigner [21] and by Husimi [22] ended up with pseudo-distributions that fail Kolmogorov probability axioms and also have consistency difficulties with special relativity.Thus, the need for a pursue of a new approach.
With a new approach to create a phase space probability density, to be developed in Section 2 capturing knowledge about the phase space variables, we further exploit how such knowledge can be used to characterize quantum states entanglement and interference.

Entropy, Interference, and Entanglement
The previous work [23,24] shows how Shannon entropy of a quantum state in phase space captures the loss of knowledge or the loss of information such a state describes.That work explores the hypothesis that knowledge (or information) cannot be gained in a closed quantum system to account for the time arrow.Here, the objective is to quantify interference and entanglement in terms of information loss or gained.
The Kullback-Liebler divergence (reviewed in Appendix B) is employed to define interference as a loss of information if one replaces the state probability density in phase space by a "classical probability density" in phase space.The Kullback-Liebler divergence is also employed to define entanglement as a loss of information if one replaces a state probability density in phase space by a product of state probability density in phase space.Such quantification of interference and entanglement could help our understanding of physical system evolution, for example by restricting which physical phenomena are allowed according to the gain or loss of interference or entanglement.
Position and spin are degrees of freedom (DoFs) required to specify a quantum state.This paper addresses how knowledge in phase space is quantified for position and spin DOFs.

Paper Organization
The remainder of the paper is organized as follows.In Section 2, the Bayesian formulation of probability density in position-momentum phase space is developed.In Section 3, the quantification of interference in phase space is proposed and compared to the entropy in phase space.In Section 4, the quantification of entanglement in phase space is proposed and compared to the entropy in phase space.Section 5 expands the concept of phase space to spin systems and proposes the quantification of entanglement for spin systems.Such quantification can also be expanded to Qbit technology.We also briefly show the approach to mixed states and compare it to von Neumann entropy.Lastly, concluding remarks are provided in Section 6.

A Bayesian View of Quantum Phase Space
One question immediately arises, what is the meaning of a joint distribution in phase space given that events in phase space cannot occur?
For quantum physics to be a statistical theory, a joint distribution for two variables of non-commuting operators must not require an event and a conditional distribution for the same variables must not require an event to be given.Instead, the joint distribution will describe the degree of belief about the joint variables and the conditional distribution will describe the degree of belief about a variable given the degree of belief about the other variable.An event associated with any one, but not both, of the variables of the non-commuting operators is still possible.
Quantum physics, as a statistical theory, is best described through the states so that classical logical manipulations such as or and and become additions and products of states analogous to operations in classical probability theory.For example, in the double slit experiment, according to classical logic an electron can pass through slit 1 or slit 2, so the quantum state at slit 1 will add with the quantum state at slit 2 to form the final state.To represent a particle A and a particle B, one takes the product of the two quantum states (a state in a product of Hilbert Spaces).After the operations with the state occur, quantum probabilities are then associated with the final state |Ψ t via the probability density matrix With this view the following proposition and theorem follows Lemma 1 (Conditional Probability Density).Given the projection of a state to the position basis, ψ t (x) = x|Ψ t , then the conditional probability density function in the spatial frequency domain is ρ(k|ψ t (x)) = | k|Ψ t | 2 .Also, given φ t (k) = k|Ψ t , then the conditional probability density function in the position domain is Proof.Given the wave function ψ t (x) = x|Ψ t , we can derive via the inverse Fourier transform the conditional wave function in spatial frequency Clearly, φ t (k|ψ t (x)) = dx k|x x|Ψ t = k|Ψ t , where k|x = 1 √ 2π e −ikx .Thus, Similarly, starting from φ t (k) = k|Ψ t and applying the Fourier transform followed by the magnitude square one obtain ρ(x|φ The conditional distribution ρ(k|ψ t (x)), in general, depends on the entire function ψ(x) and not on any one event in x.Note that conditional probabilities do not necessarily describe a cause and consequence relation but rather they are knowledge or information relation.
Theorem 1 (Joint Distribution in Phase Space).Given a state |Ψ t , evolving in time according to some Hamiltonian.Then, the joint distribution in phase space is ρ , where ψ t (x) = x|Ψ t and φ t (k) = k|Ψ t are the projection in position basis and spatial frequency basis.
Proof.Considering the state projected in the position domain to be a prior known function ψ t (x) = x|Ψ t we obtain the conditional projection of the state in the spatial frequency domain from Lemma 1 to be φ t (k|ψ t (x)) = k|Ψ t .We then have the conditional density as From Bayes' theorem applied to the density we have Clearly, we could have started with the state prior φ t (k) = k|Ψ t and obtained via the Fourier transform the (conditional) projection of the state in position ψ t (x|φ t (k)) = x|Ψ t and then obtained the same joint distribution.
In quantum statistics, the joint distribution is not to be interpreted as the product of two independent random variables since the two distributions ρ x t and ρ k t are not independent from each other, and independent events cannot occur simultaneously.

Entropy in Phase Space
Geiger and Kedem [23] proposed a quantification of knowledge of a quantum state through Shannon entropy associated with the phase space distribution, namely They showed various desired properties of this entropy, including it to be invariant to canonical transformations, special relativity and to CPT transformations.We adopt this entropy here for the study of interference and entanglement.

Interference
Given two states |Ψ A and |Ψ B and a general superposition of these two states |Ψ A or B as in Equation (A2).The projections of |Ψ A or B written in polar representation are where |.| is the magnitude value, ξ A,B (x) are the complex phases associated to the wave functions ψ A,B (x) and similarly χ A,B (k) are the complex phases associated to the wave functions φ A,B (k).
The probability densities are then where the normalization constants are , and are probability densities without the interference terms.The upper-index c refers to these probability densities also representing classical statistical combination (weighted average) of probability densities associated to the quantum states A and B.
Definition 1 (Interference:).Given two states |Ψ A , |Ψ B and their linear superposition |Ψ A or B as in Equation (A2).Interference, I, is the amount of information lost when ρ c (x, k) = p c (x)q c (k) is used to approximate ρ(x, k) = p(x)q(k).It is calculated via the Kullback-Liebler divergence between the phase space probability densities ρ(x, k) and ρ c (x, k), i.e., where CrossS(p, q) is the cross entropy between probability distributions p and q (see Equation (A8)) and the phase space entropy Equation (2) for the phase space distributions Equation ( 5) is given by As one varies the combination of the two states, the larger I is, the larger the interference contribution to the distribution ρ(x, k) = p(x)q(k).
There is no interference, i.e., I = 0, when 1. the functions' support in phase space do not overlap, i.e., the complex phases are aligned up to a constant multiple of π 2 , i.e., either θ 1 = 0, π 2 , since then there is no superposition of states.This will effectively occur when ψ A (x) = ψ B (x).
Also, IFF there is no interference, P(x) = p c (x) and Q(x) = q c (x). Figure 1c,f,i, illustrates scenarios with each state being a coherent state and not overlapping in neither position nor spatial frequency.
Clearly, one can consider the interference just in position representation or just in spatial frequency representation.However, here, the quantification of the interference in phase space distinguish the case (a) when a projection of superposition of two states in position space does not interfere but the same superposition projection in spatial frequency does interfere, from the case (b) a superposition of two states that do not interfere neither in position nor in spatial frequency.
Figure 1 illustrate some scenarios of two coherent states that can be superposed to investigate inference as discussed next.Figure 2 illustrates some scenarios comparing the Kullback-Liebler divergence (KLD) (7) and the entropy (8).The entropy captures the notion of overlap of the superposition of states.For example, when both states are similar and superposed, the classical addition of probability densities and the quantum superposition do occur.Then, the KLD and the entropy will be small.However, for cases where there is no overlap between the two states, the classical weighted average distribution is a good approximation to the quantum one, the KLD will be small, but the entropy will be large.Both concepts may be helpful to characterize the knowledge one has about the superposition of states.The KLD captures the distinction between classical probability and quantum probability, while the entropy captures the concept of lack of overlap of two states in quantum phase space.One advantage of the entropy over the KLD is that one does not need to know the components of the superposition of states to evaluate the entropy.
The role of the phase differences is noticeable, in position and in spatial frequency, as per (10).In addition, for coherent states, used in the simulations, the difference in phase of the state projection in position basis is the differences in the centers of the state projection in the spatial frequency basis.Also vice versa, the phase difference in spatial frequency is the center difference in the position domain.The periodic range for ∆µ is reduced for Figure 2c,d.creating the oscillations in the KLD and entropy.Entropy seems to be a good estimation for the interference behavior when the two states overlap either in spatial frequency or in position.However, the more the overlap in both spaces is reduced the more the two quantities differ in behavior., where δµ ≈ δk ≈ 0. However, the KLD becomes small as the states do not overlap while the entropy gets to be larger.As the phase increases from (a,b) to (c,d) oscillation increases for both (KLD and Entropy) as periods reduce.Entropy seems to be a good estimation for the interference behavior when the two states overlap either in spatial frequency or in position.However, the more the overlap in both spaces is reduced the more the two quantities differ in behavior.

Entanglement
Given two states |Ψ A and |Ψ B consider the entangled state |Ψ A and B as described in Equation (A3).Extending the language of probability to the quantum states, each state represent a random object.A quantum Bayesian interpretation of this state is that it reflects a joint state of |Ψ A and |Ψ B , thus the sub-index A and B. Considering the prior to be the state |Ψ A , then the conditional state-operator, which is the product of a state in one Hilbert space times an operator acting in another Hilbert space, is where I is the identity operator acting in the other Hilbert space.
Clearly, we would have obtained the same joint state had we considered the prior to be |Ψ B and the conditional state-operator to be |Ψ The probability density in position space of the joint state is then Similarly, the probability density q(k x , k y ) in spatial frequency space is where θ 2 = π 4 and ϕ 2 = 0, π describe bosons and fermions, respectively.However, in various empirical works, specially related to quantum computers, one can trap fermions on different locations so that they do not occupy the same state, and still entangle their spin with any general set of parameters.Thus, in theory, one could prepare two fermions to have different spin states and allow them to combine in phase space freely, described by a general set of parameters above.Expanding this formalism to Qbits there are no restrictions on the set of parameters used.
The phase space entropy (2) for this joint state becomes S(|Ψ 2 ) = − dx dk x dy dk y p(x, y) q(k x , k y ) log p(x, y), q(k x , k y ) .( 16) The product of states, or the disentangled states, are described by the two-state ( 12) with θ 2 = 0, π 2 , i.e., where the upper index D1 and D2 indicate two different disentangle states.
Proposition 1.Given two states |Ψ A , |Ψ B and the two-state |Ψ A and B , shown in (12).Refer by p D (x, y)q D (k x , k y ) to any of the disentangled states (17).The Kullback-Liebler divergence D KL p(x, y)q(k x , k y )|| p D (x, y)q D (k x , k y ) is invariant to any choice of disentangled states for bosons or fermions, where θ 2 = π 4 and ϕ 2 = 0, π, respectively.
Proof.The proposition follows from performing the decomposition of the logarithm of products into the sum of logarithms and then using the symmetric properties of fermions and bosons.More precisely, D D KL = −S(p, q) + dx dy dk x dk y p(x, y)q(k x , k y ) log p D (x, y)q D (k x , k y ) (18) where D D KL is short for D D KL p(x, y)q(k x , k y )|| p D (x, y)q D (k x , k y ) .Then, for D1 we get where p(x) = dy p(x, y), p(y) = dx p(x, y), q(k x ) = dk y q(k x , k y ), q(k y ) = dk x q(k x , k y ).
Note that the integrals yield the same functions p(.) and q(.) due to the symmetric properties for bosons and fermions.
Similarly, for D2 we get and clearly every term here has a perfect match in ( 19), e.g., dx p(x Definition 2 (Entanglement:).Given two states |Ψ A , |Ψ B and the two-state |Ψ A and B , shown in (12), that when projected in phase space yields the probability density distribution ρ(x, y, k x , k y ) = p(x, y)q(k x , k y ) given by ( 14) and (15).Entanglement, E , is the amount of information lost when the product of states is used to approximate ρ(x, y, k x , k y ).More formally, for bosons or fermions, where θ 2 = π 4 and ϕ 2 = 0, π we have = CrossS(p(x, y)q(k x , k y ), p D (x, y)q D (k x , k y )) − S(p(x, y)q(k x , k y )) and when the parameters θ 2 and ϕ 2 are free to vary (as in Qbits) then The entanglement vanishes when 1.
One comparison of interest is between the entanglement (21) and the entropy (16).Figure 3 illustrates some scenarios combining two coherent states where these two quantities are evaluated and a comparisons is made.While the definition of entanglement is through the KLD, the entropy captures a similar behavior and it can be evaluated from the state itself, without having to know what the product of the states would be.Entropy behavior seems to be a good estimation for entanglement.One reason is that the basis functions for a product of N Hilbert spaces of the total operators X = x 1 × x 2 . . .× x N and K is given by the product of states |x 1 . . .|x N and |k 1 . . .|k N .These are the eigenstates of the operators x 1 . . .x N and k 1 . . .k N .Then the product of coherent quantum states, where each one minimizes the phase space entropy, projected in these bases have lower entropy than the entangled states, which are linear superposition of these products.
pi 4 is fixed when entangling the two states.Cases (a-d) show KLD and Entropy, respectively, for a symmetric entanglement where phase ϕ 2 = 0. Cases (e,f) show KLD and Entropy, respectively, for an anti-symmetric entanglement where phase ϕ 2 = π.The effect of the phase ϕ 2 is only noticeable when the two states are very similar to each other and then both, KLD and entropy, yield large values for the anti-symmetric case (after all anti-symmetric functions must vanish in these cases, while the product of states does not).While the KLD has a smoother behavior, both increase as the separation of the two coherent state parameters increases.The larger values of the phase parameters in (c-f) clearly cause a periodic behavior.Entropy behavior seems to be a good estimation for entanglement.
The study of the more general case of entanglement (22), which is applicable to Qbits, is left as future work.

Entanglement for Spin or Qbit Phase Space
The degrees of freedom (DoFs) in quantum physics specify the wave function (projection of the state in position space) and the spin.Thus, when quantifying our knowledge of a quantum system one must also quantify our knowledge of a spin state.Qbits are like spin in formalism, but with less constraints as the Pauli exclusion principle is no longer required (since other aspects of the complete state may already be identified or are already anti-symmetric).What follows also applies to Qbits.
Let us consider two spin states |ξ A and |ξ B , each formed with N A and N B spin s particles, respectively.A spin state formed from these two states may be in a superposition of any of the total spin magnitudes s ∈ [s N , s N − 1, . . ., 1  2 mod (2 s N , 2)], where s N = Ns and N = N A + N B , and is written as The operator S z associated with |ξ is given by where I N is the identity of dimension N and ⊗ is the exterior product.
As discussed earlier, the position and spatial frequency operators for a product of N Hilbert spaces is the product of position operators in each Hilbert space.For spin states, the total S z operator (24) is not just the product of single particle operator.The set of eigenstates of the operators S z and S 2 include entangled spin states.Let us refer to the eigenstates of S z and S 2 as {|ξ s,m s ; s = s N , s N − 1, . . .s min , m s = −s, . . ., s}, where s min = 1 2 mod (2 * s N , 2).Thus, the state is written in this basis as where α s,m = ξ s,m |ξ ∈ C and 1 = ∑ s N s=s min ∑ s m=−s |α s,m | 2 .The phase space for the spin associated with the operators S z and S 2 is derived from quantizing the sphere, the surface of the ball with a radius of the spin magnitude h s N , as developed by the Geometric Quantization (GQ) method, e.g., see [25][26][27].Geiger and Kedem [24] have proposed this approach to evaluate the entropy of a quantum state which is briefly summarized next.
The conjugate basis to {|ξ s,m } is {|φ ; [0, 2π]}, obtained by identifying the angle φ, the rotation angle around the z-axis of e −iS z φ , as the conjugate operator to S z .The spin state |ξ in this basis is where and The two solutions in (28) are periodic in φ and differ by a phase (gauge) transformation of e −i2sφ .
Thus, the entropy (2) of a spin state |ξ in spin phase space is The first term is the Shannon entropy capturing the randomness of the spin value along the z-axis.The second term is differential entropy capturing the randomness of the spin value in the plane perpendicular to the z-axis, i.e., the entropy of the polarization angle φ.Geiger and Kedem [24] have shown that this entropy reaches its lowest value zero for the eigenstates of the two operators S z and S 2 .Products of states that are described by the superposition of these eigenstates will have higher entropy.The densities in spin phase space associated with the product state |ξ A |ξ B are derived from the projections Extending the work of [24] to also define the Kullback-Liebler divergence between a joint state as described by (12) and the product of the states |ξ A |ξ B .Definition 3 (Spin Entanglement:).Given two spin states |ξ A , |ξ B and the joint state |ξ , shown in (23), that when projected in spin phase space yields the probability density distribution ρ s,m (φ) = |α s,m | 2 |λ(φ)| 2 .Spin entanglement, sE , is the amount of information lost when the product of states is used to approximate ρ s,m (φ).More precisely, One comparison of interest to be made is between spin entanglement (31) and spin entropy (29).In contrast to the position × spatial frequency phase space entanglement, the spin entropy will be minimized and attain value zero for some entangled states that are eigenstates of the total spin operators, such as it is the case of Bell entangled states for two particles [24].The product of states will then be described by the superposition of these entangled states and thus, will not have a well-defined spin magnitude.Their entropy will be larger.The KLD will then anti-correlate with the entropy.A more detailed study is left for the future.

Expansion to Mixed States
One extension of this approach to mixed states starts from the density matrix derived from the general state (23), namely Then, tracing out the density matrix (and assuming the states |ξ A , |ξ B to be in orthogonal Hilbert spaces to each other) Then, von Neumann entropy is the Shannon entropy of this mixed state.While this is of interest to much research, Shannon entropy of pure states precedes von Neumann entropy.Also, von Neumann entropy obtained by tracing out pure states has some similarity to the entropy of the superposition of two states Equation (A2), but neglecting both quantities the interference and the conjugate variable of the phase space.Geiger and Kedem [23] showed invariant properties of the Shannon entropy in phase space that von Neumann entropy would not have.

Conclusions
The Bayesian statistic view was developed to construct the phase space probability density.The Bayesian approach to describe quantum physics in phase space considered the conditional probability densities and joint probability densities to be descriptions of the degree of belief about the phase space variables.It was observed that events representing measurements in the phase space do not occur.However, in a coarse description of the phase space, where the elementary volumes satisfy the uncertainty principle, they do occur.This derivation of the phase space density provides further support to the work of Geiger and Kedem [23,24] where they developed the entropy in phase space.
As a Bayesian theory, quantum physics is subjective, even if the prior knowledge is common to different observers, the measurements may not be.The entanglement scenarios where teleportation experiments have been reported, demonstrate the subjectivity of quantum physics as different observers make different predictions about the outcome of a variable.Moreover, the Bayesian view of quantum physics is that the theory is not a complete causal theory.The time evolution of a state via the unitary evolution is causal, as the Hamiltonian causes the state to evolve.However, the Bayesian combination of observation and the prior knowledge is not causal.
The Bayesian theory of quantum physics also considered the probabilistic object of manipulations to be the state and not the probabilities.Addition and multiplication operations usually associated with or and and logical operations were applied to quantum states and not to quantum probabilities.
With the phase space probability density constructed, the next objective was to quantify interference and entanglement in terms of information loss or gain, using the Kullback-Liebler divergence (rooted on Shannon entropy).A comparison to the Shannon entropy of the state and some similarities between the two quantities were revealed.One advantage of entropy is that it can be inferred from the quantum state, without any reference to the two states that were used to compute the divergence.It was noted that for spin phase space the relations between entropy and the Kullback-Liebler diverges will differ since the eigenstates of the total spin operator become entangled states.
Extrapolating the Bayesian approach to a philosophical interpretation, and revisiting the Bohr vs. Einstein debate, the view put forward here is similar to the epistemological view of Bohr: quantum theory is today the best model for predictions.The view put forward here also resonates with the ontological concern of Einstein adding the question: "Is quantum theory a causal theory?"If not, 'is there a causal theory to be discovered?"After all, by adopting the Bayesian view, the question of a complete causal theory was left open.The Kullback-Liebler divergence between two probability distributions p(x) and q(x) is given by D KL (p||q) = − p(x) log q(x) p(x) dµ(x) = CrossS(p, q) − S(p).(A9)

Figure 1 .
Figure 1.Normal distribution in phase space for two coherent states, A and B in 1D with centers and variances as follows.For all (a-c) position space probabilities, with µ A = 27 with σ A = 5.6 and for (a) µ B = 27, (b) µ B = 37, (c) µ B ≈ 48, all with σ B = 4.2 .Note that for each coherent state, the spatial frequency value is the phase of the coherent state in position space.For (d-f), spatial frequency space probabilities, with k A ≈ 0.35 and (d) k B ≈ 0.35, (e) k B ≈ 0.75, (f) k B ≈ 1.15.For (g-i), spatial frequency space probabilities, with k A ≈ 1.04 and (g) k B ≈ 1.04, (h) k B ≈ 1.55, (i) k B ≈ 2.05.

Figure 2 .
Figure 2. Interference Simulations for a superposition of two coherent states as shown in Figure 1.The coherent state |Ψ A has µ A = 27, σ A = 5.6, and for (a,b) the phase is k A = 0.35 while for (c,d), the phase is k A = 1.04.The other coherent state |Ψ B in position has fixed σ B = 4.2 and the position center and phase vary in 48 increments each, as follows: µ B = [27, . . ., 48], and for (a,b), the phase varies as k B = [0.35, . . ., 1.15], while for (c,d), the phase varies as k B = [1.04, . . ., 2.05].The plots axis are all with ∆µ = µ B − µ A vs. ∆k = k B − k A .The KLD and the entropy become small as the two states closely overlap, i.e., where δµ ≈ δk ≈ 0. However, the KLD becomes small as the states do not overlap while the entropy gets to be larger.As the phase increases from (a,b) to (c,d) oscillation increases for both (KLD and Entropy) as periods reduce.Entropy seems to be a good estimation for the interference behavior when the two states overlap either in spatial frequency or in position.However, the more the overlap in both spaces is reduced the more the two quantities differ in behavior.

Figure 3 .
Figure 3. Entanglement simulations from two coherent states shown in Figure 1, with normal probability distributions.Note that the phase of the coherent state projected to position space is the center of the coherent state projected in the spatial frequency space, and vice versa.The coherent state |Ψ A has a fixed set of parameters, µ A = 27, σ A = 5.6 in position space, and for (a,b) the phase is k A = 0.35 while for (c-f) the phase is k A = 1.04.The coherent state |Ψ B in position has fixed σ B = 4.2 and the center and phase vary in 48 increments each, as follows: µ B ∈ [27, . . ., 48], and for (a,b) k B ∈ [0.35, . . ., 1.15], while for (c-f) the phase varies as k B ∈ [1.04, . . ., 2.05].The parameter −s+m)φ , m < 0 (southern hemisphere) .

Funding:
This research received funding from National Science Foundation under Grant No. DMS-1439786 and the Simons Foundation Institute Grant Award ID 507536 while the author was in residence at the Institute for Computational and Experimental Research in Mathematics in Providence, RI, during the spring 2019 semester "Computer Vision" program.Informed Consent Statement: Not applicable.
The conditional stateoperator, with an abuse of state notation, reflects the impacts/change to a state |Ψ B given the knowledge of the state |Ψ A .This conditional state-operator leads to the joint state by acting on the prior state |Ψ A as follows |Ψ A and B = |Ψ cond-B/A |Ψ A = A |Ψ B + e iϕ 2 sin θ 2 |Ψ B |Ψ A .(12) associated with joint density operator ρ A and B = |Ψ A and B Ψ A and B